Abstract

Background: Breast cancer is one of the leading causes of cancer death for women all over the world, and
mammography is regarded as one of the main tools for its early detection. Computer aided technology has been
introduced to support breast cancer detection. In computer aided cancer detection, the detection and
segmentation of masses are very important, because the shape of a mass can be used as one of the factors to
determine whether the mass is malignant or benign. However, many of the current methods are semi-automatic.
In this paper, we investigate a fully automatic segmentation method.

Results: A new mass segmentation algorithm is proposed. In the proposed algorithm, a fully automatic
marker-controlled watershed transform first segments the mass region roughly, and a level set is then used
to refine the segmentation. To address the over-segmentation caused by the watershed, we also investigated
different noise reduction techniques. Images from DDSM were used in the experiments and the results show
that the new algorithm can improve the accuracy of mass segmentation.

Conclusions: The new algorithm combines the advantages of both methods. The combination of watershed
based segmentation and the level set method can improve the efficiency of the segmentation. In addition, the
introduction of noise reduction techniques can reduce over-segmentation.



Background 

Breast cancer is one of the leading causes of cancer
death for women all over the world [1], and early detection
is one of the main ways to reduce the death rate of
patients with breast cancer [2-4]. Mammography is
regarded as one of the most effective methods for
detecting early breast cancer. Although mammography is
widely used, the rate of correct diagnosis of breast cancer
using mammography needs improvement [5]. Computer
aided diagnosis was therefore proposed to assist
radiologists in the diagnosis of breast cancer and to
improve diagnostic accuracy [6].

In computer aided cancer diagnosis, the detection and 
segmentation of mass are very important. The shape of 
mass can be used as one of the factors to determine 
whether the mass is malignant or benign. Many mass
segmentation methods have been proposed, including
manual segmentation [7], semi-automatic segmentation [8],
and fully automatic segmentation [9]. Although manual
segmentation is considered to be the best mass boundary
extraction method [10,11], it is time-consuming and is
subject to intra-observer and inter-observer variation [11]. In [12],
Huo et al. developed a semi-automatic region growing 
approach based on the choice of the starting point by the 
radiologist. In [13], Kobatake et al. applied a modified 
Hough transform to extract lines passing near the centre 
of the mass and automatically selected candidates based 
on the number of line-skeletons. In [14], Lou et al.
proposed a mass segmentation algorithm based on the
assumption that the trace of intensity values from the
breast region to the air-background is a monotonically
decreasing function. In [15], Zheng et al. proposed an
algorithm using the difference image obtained by
subtracting the Gaussian filtered image from
the original image. In [16], Petrick et al. proposed a 
method for mass segmentation. The basic idea of the pro- 
posed method is to select seeds using local maxima in the 
original image and generate a gradient image using a fre- 
quency-weighted Gaussian filtering. With this image, the 
thresholds of the regions bounded by the edges are 
extracted. In [17], Qi and Snyder proposed a method for
mass segmentation. They used Bézier splines to interpolate
histograms, from which they extracted the region with
threshold values at local maxima. In [18], Guliato et al.
proposed a pixel based algorithm. The proposed algorithm 
aims to preserve the transition between masses and nor- 
mal tissue to segment the mass boundary. In [19], Mudi- 
gonda et al. used multilevel thresholding to detect closed 
edges for mass segmentation. Further approaches are
reported in [20-22].

Although many other results on mass segmentation 
have been published, automatic segmentation of mass is 
still considered difficult because of the ill-defined 
boundaries and overlapping with fibro-glandular tissue 
of many masses [11]. In this paper, we study a fully
automatic mass segmentation algorithm. Our basic idea is to
combine two segmentation algorithms: watershed based
segmentation and level set based segmentation. Level set
based segmentation methods are powerful image
segmentation tools and have long been used because they
have many advantages; for example, they can handle
concavities, splitting, and merging of contours. They are
therefore still used in many fields, including medical
image processing [23]. However, level set based
segmentation methods have several disadvantages. One of
the main disadvantages is that they are computationally
expensive. In addition, level set based algorithms
generally need human interaction. In order to reduce this
interaction, this paper proposes an algorithm which
combines a fully automatic marker-controlled watershed
segmentation method with level set based segmentation.
In the combined algorithm, the segmentation result from
the watershed is used as the input of the level set
segmentation, and the level set algorithm is used to
refine the boundary.

Results 

Experimental materials 

In the experiments, we randomly selected 200 mammograms
from the DDSM database [24] to verify the proposed
algorithm. To reduce the computation cost, we resampled
the original images at a reduced pixel size and 256 gray
levels. The mass location was identified by an
experienced radiologist and a region of interest (ROI)
containing the mass was extracted. The selected samples
contain lesions with different breast-tissue density,
different degrees of subtlety, and different sizes. The
size distributions of the malignant and benign masses
overlapped. Of the 200 cases, 100 are benign and 100
are malignant.
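
The resampling step can be illustrated with a minimal sketch. It assumes that "reduced pixel size" means a coarser pixel grid obtained by block averaging, and that the intensities are mapped linearly onto 256 gray levels; the function name `downsample_to_8bit` and the integer `factor` are hypothetical, since the paper does not state the resampling ratio or method.

```python
import numpy as np

def downsample_to_8bit(image, factor):
    """Block-average an image by an integer factor and rescale it to 256 gray levels.

    Assumption: the resampling ratio and method are not given in the paper;
    block averaging with an integer `factor` is used here for illustration only.
    """
    image = np.asarray(image, dtype=np.float64)
    # Crop so that both dimensions are multiples of the factor.
    rows = (image.shape[0] // factor) * factor
    cols = (image.shape[1] // factor) * factor
    cropped = image[:rows, :cols]
    # Average each factor-by-factor block to obtain a coarser grid.
    blocks = cropped.reshape(rows // factor, factor, cols // factor, factor)
    reduced = blocks.mean(axis=(1, 3))
    # Linearly map the intensity range onto 0..255.
    lo, hi = reduced.min(), reduced.max()
    if hi == lo:
        return np.zeros(reduced.shape, dtype=np.uint8)
    scaled = (reduced - lo) / (hi - lo) * 255.0
    return np.round(scaled).astype(np.uint8)

# Example: a 4x4 image with a wide intensity range reduced to 2x2 and 8 bits.
raw = np.arange(16).reshape(4, 4) * 100
small = downsample_to_8bit(raw, factor=2)
```

Block averaging acts as a mild low-pass filter before subsampling, which is one reason it might be preferred over simply dropping pixels.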

A program was developed in Matlab to run on all
the test images without user intervention. The results
show that all segmentations were accurate in
comparison with the regions marked by the radiologist on
the mammograms. Figure 1 shows some mammograms from
DDSM and the segmentation results using the watershed
transform and the level set based segmentation method.

Segmentation evaluation 

Many segmentation evaluation methods have been
proposed; however, segmentation evaluation is still an
open topic [25,26]. There are mainly two kinds of
evaluation: subjective and objective. In subjective
evaluation, a visual check is often adopted, while in
objective evaluation the segmentation obtained by the
computer is evaluated against the segmentation obtained
by a technician. In this paper, we adopt objective
evaluation. The evaluation measures used in the paper
are [25]:



\[
\mathrm{Hitting} = \frac{TP}{TP + FN}
\]

where TP, FP and FN are True Positives, False Positives,
and False Negatives respectively. Figure 2 shows the
basic idea of TP, FP and FN of a mass segmentation. In
Figure 2, TP represents the intersection of the regions
marked by the radiologist and by the algorithm, FP
represents the region segmented only by the algorithm,
and FN represents the region marked only by the
radiologist [25]. Hitting denotes the ratio of correct
segmentation, Missing denotes the ratio of missing mass,
OverHitting denotes the ratio of false mass segmented,
RelativeHitting denotes the relative correct ratio against
the segmentation results, and RelativeMissing denotes the
relative missing ratio against the segmentation results [25].

Figure 1 (a) Original images selected from DDSM; (b) markers and object boundaries superimposed using the watershed algorithm on
the original images; (c) the final segmentation results based on the improved level set.
Figure 2 True Positives (TP), False Positives (FP), and False Negatives (FN) definition. Legend: boundary by radiologist; boundary by level set method.
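
The pixel counts and the Hitting measure can be computed from two binary masks, as in the following minimal sketch. The function names `overlap_counts` and `hitting` are hypothetical, and only Hitting is implemented because it is the only measure whose formula is spelled out in the text.

```python
import numpy as np

def overlap_counts(algorithm_mask, radiologist_mask):
    """Return (TP, FP, FN) pixel counts for two binary masks of equal shape."""
    alg = np.asarray(algorithm_mask, dtype=bool)
    ref = np.asarray(radiologist_mask, dtype=bool)
    tp = int(np.count_nonzero(alg & ref))    # marked by both
    fp = int(np.count_nonzero(alg & ~ref))   # marked only by the algorithm
    fn = int(np.count_nonzero(~alg & ref))   # marked only by the radiologist
    return tp, fp, fn

def hitting(tp, fn):
    """Hitting = TP / (TP + FN), the fraction of the reference region that was found."""
    if tp + fn == 0:
        raise ValueError("the reference region is empty")
    return tp / (tp + fn)

# Toy example: two 2x2 squares that overlap in one column.
ref = np.zeros((4, 4), dtype=bool)
ref[1:3, 1:3] = True
alg = np.zeros((4, 4), dtype=bool)
alg[1:3, 2:4] = True
tp, fp, fn = overlap_counts(alg, ref)
print(tp, fp, fn, hitting(tp, fn))  # prints: 2 2 2 0.5
```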



Segmentation results 

The segmentation results of the proposed method and
the manual segmentation by a radiologist are compared
in Figure 3. In Figure 3, the black contours are the
segmentation results of the proposed algorithm and the
green contours are the results obtained by a radiologist.
The contours obtained by the proposed algorithm are
close to the contours obtained by the radiologist, which
indicates that the proposed algorithm is effective.
Table 1 and Table 2 show the results of the quantitative
analysis, which also support the effectiveness of the
proposed algorithm.

Besides comparing the proposed algorithm with the
human segmentation, we also compared the effectiveness
of different noise reduction techniques for reducing
over-segmentation. The comparison results are shown in
Figure 4. From Figure 4, we can see that the average
filter is less effective than the Gaussian filter, which in
turn is less effective than the anisotropic diffusion
filter. The anisotropic diffusion filter reduces
over-segmentation effectively, and we therefore adopted
it in the proposed algorithm.

Discussion 

In this paper, we propose a mass segmentation algorithm
which combines the watershed method and the level set
method. The new method is divided into two steps: a
marker-controlled watershed transform is first used to
segment the mass region roughly, and then a level set is
used to refine the segmentation.

Watershed based segmentation has advantages that can
overcome disadvantages of level set based segmentation.
The level set method usually needs hundreds of
iterations to obtain a good segmentation result. With a
good initialization provided by watershed segmentation,
the level set method can converge more quickly, which
greatly speeds up the whole segmentation procedure. In
addition, by using watershed segmentation as the
initialization step, we can remove the manual
initialization step of general level set segmentation and
obtain a fully automatic segmentation algorithm.

However, the proposed algorithm still has a few
limitations. The input of the proposed algorithm is an
ROI image that has already been cut from the whole
mammogram. Thus a mass detection step needs to be
merged into the algorithm in the future. Although noise
reduction techniques are introduced into the algorithm,
over-segmentation still happens on some mammographic
images. Over-segmentation affects the efficiency of the
algorithm, and thus a more effective method for reducing
over-segmentation is needed in the future. Another issue
is the time complexity of the level set. Using the result
from the watershed as the initialization saves a lot of
time, but the level set still needs considerable
computation time to achieve accurate segmentation
results.

Conclusions 

In this paper, we have developed a hybrid method to
segment mammograms using the watershed algorithm and
the level set method. We used the watershed transform
to provide a coarse and fast pre-segmentation, and used
the resultant segmentation as the initial contour for
the level set segmentation. Automatic selection of the
starting point from the watershed transform reduces the
user interaction. The combination of the two
segmentation methods speeds up the entire segmentation
process and improves the segmentation efficiency. In
addition, the method has good topological adaptability;
it can deal well with the complex and changing shapes
in mammograms and achieves high segmentation accuracy.
Experimental results show that the proposed
segmentation method can obtain good results.

Figure 3 Results of the segmentation algorithm. (a) The final segmentation results based on the improved level set; (b) the region
marked by the radiologist; (c) the comparison between (a) and (b).



Method 

Mass segmentation includes two steps in the proposed
algorithm. The first step uses the watershed transform
for rough segmentation and the second step uses a
level set based method to refine the segmentation
obtained by the watershed transform. Watershed based
algorithms are mathematical morphology methods for
image segmentation and they have many advantages in
comparison with other image segmentation methods.
For example, watershed transform based segmentation
methods generally have high computation speed and
can obtain closed contour lines with accurate positions.
In addition, watershed based image segmentation
algorithms can handle weak edges very well [27].

Table 1 Pixel counts of the different regions in Figure 3

CaseNo: 0046, 0051, 0069, 0074, 0123, 0161, 0226, 0274
TP: 4517, 3235, 2913, 12912, 7419, 4339, 1, 1583
FP: 635, 370, 1475, 2611, 1452, 2050, 890, 704
FN: 825, 179, 140, 4654, 2566, 858, 575

The basic idea of the watershed can be described as
follows [27]: let $I$ be a gray image and $\|\nabla I\|$
the gradient image obtained from $I$. In order to segment
the objects in the image, foreground markers are computed
for the objects. After the markers are obtained, flood
waves propagate from the set of markers to cover the
topographic surface $\|\nabla I\|$ [27]. When the water
reaches the maximum gray value, the edges of the union of
all dams form the watershed segmentation. Figure 5 shows
the definition of the watershed.

Table 2 Validation measures (percent) for Figure 3

Figure 4 (a) The results after different filters; (b) the segmentation results based on (a).

Figure 5 Watershed.

In the implementation of the watershed algorithm, if
the watershed is applied directly to the gradient image,
there are too many ridgelines, which causes
over-segmentation (see Figure 6(b)). To reduce the
over-segmentation, a marker-controlled watershed is used.
In the marker-based watershed method, each marker is a
connected component. After the marker-based watershed is
applied, we obtain Figure 6(c).
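
The flooding idea can be sketched as a simplified priority-flood over the gradient image. This is an illustration under stated assumptions, not the Matlab implementation used in the paper: the function names are hypothetical, watershed lines are not drawn (each pixel simply receives the label of the flood that reaches it first), and the markers in the example are placed by hand, whereas the paper computes them automatically.

```python
import heapq
import numpy as np

def gradient_magnitude(image):
    """Gradient magnitude used as the topographic surface."""
    gy, gx = np.gradient(np.asarray(image, dtype=np.float64))
    return np.hypot(gx, gy)

def marker_watershed(relief, markers):
    """Flood `relief` from labeled markers (0 = unlabeled) and return a label image."""
    relief = np.asarray(relief, dtype=np.float64)
    labels = np.array(markers, dtype=np.int64)
    rows, cols = relief.shape
    heap = []
    counter = 0  # tie-breaker: among equal heights, first queued is flooded first
    for r, c in zip(*np.nonzero(labels)):
        heapq.heappush(heap, (relief[r, c], counter, int(r), int(c)))
        counter += 1
    while heap:
        # Always continue flooding from the lowest pixel reached so far.
        _, _, r, c = heapq.heappop(heap)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and labels[nr, nc] == 0:
                labels[nr, nc] = labels[r, c]  # the neighbor joins this basin
                heapq.heappush(heap, (relief[nr, nc], counter, nr, nc))
                counter += 1
    return labels

# Example: a bright 3x3 blob with one object marker and one background marker.
image = np.zeros((9, 9))
image[3:6, 3:6] = 1.0
markers = np.zeros((9, 9), dtype=int)
markers[4, 4] = 1  # hand-placed object marker (assumption)
markers[0, 0] = 2  # hand-placed background marker (assumption)
labels = marker_watershed(gradient_magnitude(image), markers)
```

Because flooding starts only from the markers, the number of output regions equals the number of markers, which is how the marker-controlled variant avoids the many small regions produced by flooding from every local minimum of the gradient.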

After the image is segmented using the watershed
transform, we use the resultant contour as the initial
contour for a level set based method to refine the
segmentation. The level set algorithm used in the
proposed algorithm is from [28]. It is based on a region
based active contour model. This model assumes that an
image is formed by two homogeneous regions, and can be
formulated by the following energy functional [29,30]:

\[
E_{CV}(C, c_1, c_2) = \lambda_1 \int_{\mathrm{inside}(C)} |I_0(x,y) - c_1|^2 \, dx\,dy + \lambda_2 \int_{\mathrm{outside}(C)} |I_0(x,y) - c_2|^2 \, dx\,dy + \mu |C|, \quad (\lambda_1, \lambda_2 > 0,\ \mu > 0) \tag{1}
\]

where $\lambda_1$, $\lambda_2$, $c_1$, $c_2$ are constants, C is the evolving
contour, |C| is the length of contour C, and inside(C) and
outside(C) are the regions inside and outside the
contour.
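
A discrete version of the energy in equation (1) can be evaluated for a candidate region, as in the sketch below. It rests on assumptions not taken from the paper: the contour is represented by a binary mask of its inside, c1 and c2 are set to the region means (the values that minimize the two fitting terms for a fixed contour), and the contour length is approximated by the number of 4-neighbor pixel pairs that straddle the boundary. The function name and default weights are hypothetical.

```python
import numpy as np

def chan_vese_energy(image, inside, lam1=1.0, lam2=1.0, mu=0.1):
    """Discrete two-region energy for a binary `inside` mask; returns (energy, c1, c2)."""
    image = np.asarray(image, dtype=np.float64)
    inside = np.asarray(inside, dtype=bool)
    outside = ~inside
    # For a fixed contour, the fitting terms are minimized by the region means.
    c1 = image[inside].mean() if inside.any() else 0.0
    c2 = image[outside].mean() if outside.any() else 0.0
    fit_in = lam1 * np.sum((image[inside] - c1) ** 2)
    fit_out = lam2 * np.sum((image[outside] - c2) ** 2)
    # Approximate |C| by counting neighboring pixel pairs on opposite sides of the contour.
    length = (np.count_nonzero(inside[1:, :] != inside[:-1, :])
              + np.count_nonzero(inside[:, 1:] != inside[:, :-1]))
    return fit_in + fit_out + mu * length, c1, c2

# Example: a mask that matches the bright square has lower energy than a shifted one.
image = np.zeros((6, 6))
image[2:4, 2:4] = 10.0
good = image > 5.0
shifted = np.zeros((6, 6), dtype=bool)
shifted[2:4, 3:5] = True
e_good, c1_good, c2_good = chan_vese_energy(image, good)
e_shifted, _, _ = chan_vese_energy(image, shifted)
```

A level set method lowers this energy by moving the contour; the closer the initial mask already is to a low-energy configuration, the fewer updates are needed.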

Although the level set method can produce successful
segmentations, it needs a good initialization. To solve
this problem, the proposed method uses the contour
obtained from the watershed segmentation step as the
initial contour of the level set. Combining the two
methods in this way resolves the drawbacks of each
mentioned above.

Besides the initialization issue, there is also the issue
of noise. In general, mammograms contain a lot of noise.
If the watershed algorithm is applied to the image
directly, over-segmentation will happen because the
watershed algorithm is very sensitive to noise. To avoid
over-segmentation, we need to remove the noise; once the
noise is removed, we can obtain the coarse segmentation
using watersheds. The noise reduction methods
investigated in this paper include the average filter,
the Gaussian filter and anisotropic diffusion [31].
Anisotropic diffusion was introduced by Perona and
Malik [31]; it uses the local image gradient to control
the degree of diffusion. Anisotropic diffusion can
eliminate noise effectively while preserving the edges
of the image. The anisotropic diffusion used in the
proposed algorithm is the method developed in [32].
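
The edge-preserving behaviour can be illustrated with a minimal sketch of the classic Perona-Malik scheme on a 4-neighbor grid. This is not necessarily the variant from [32] used in the paper; the function name, the exponential edge-stopping function, and the parameter values (`kappa`, `step`, `n_iter`) are assumptions chosen for illustration.

```python
import numpy as np

def perona_malik(image, n_iter=20, kappa=0.1, step=0.2):
    """Explicit Perona-Malik diffusion; `step` should not exceed 0.25 for stability."""
    u = np.asarray(image, dtype=np.float64).copy()
    for _ in range(n_iter):
        padded = np.pad(u, 1, mode="edge")
        # Differences to the four nearest neighbors (zero flux at the image border).
        d_n = padded[:-2, 1:-1] - u
        d_s = padded[2:, 1:-1] - u
        d_w = padded[1:-1, :-2] - u
        d_e = padded[1:-1, 2:] - u
        # Edge-stopping weight exp(-(d / kappa)^2): close to 1 for small
        # differences (noise), close to 0 across strong edges.
        flux = sum(np.exp(-(d / kappa) ** 2) * d for d in (d_n, d_s, d_w, d_e))
        u += step * flux
    return u

# Example: a noisy vertical step edge.
rng = np.random.default_rng(0)
clean = np.zeros((16, 16))
clean[:, 8:] = 1.0
noisy = clean + 0.02 * rng.standard_normal(clean.shape)
smooth = perona_malik(noisy)
```

In contrast, an average or Gaussian filter applies the same weights everywhere, so it blurs the mass boundary together with the noise.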

The proposed algorithm is shown in Figure 7. It is
composed of several steps: the original image is
preprocessed and then used as the input of the watershed
segmentation, which gives the rough segmentation. The
rough segmentation is then used as the starting contour
for the level set segmentation. This approach combines
the advantages of the two methods and overcomes the
disadvantages of each single method: the marker-based
watershed is rough but fast, and the level set
segmentation needs a certain number of iterations but
produces the final, highly accurate, smooth results.

Figure 7 Flowchart of the segmentation algorithm.
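
The order of the stages can be sketched as a small pipeline that accepts the three stages as functions. Everything below is hypothetical scaffolding: the stand-in stages (a 3x3 mean filter, a global threshold, and a two-region mean reassignment) are deliberately simple substitutes for the anisotropic diffusion, marker-controlled watershed, and level set steps of the paper, and passing the preprocessed image to the refinement stage is an assumption.

```python
import numpy as np

def segment_mass(roi, denoise, coarse_segment, refine):
    """Run the stages in order: preprocessing, rough segmentation, refinement."""
    smoothed = denoise(roi)
    initial_mask = coarse_segment(smoothed)
    # Assumption: the refinement operates on the preprocessed image.
    return refine(smoothed, initial_mask)

def box_filter(image):
    """Stand-in for the noise reduction step: 3x3 mean filter."""
    image = np.asarray(image, dtype=np.float64)
    rows, cols = image.shape
    padded = np.pad(image, 1, mode="edge")
    return sum(padded[r:r + rows, c:c + cols] for r in range(3) for c in range(3)) / 9.0

def threshold_mask(image):
    """Stand-in for the rough segmentation: pixels brighter than the mean."""
    return image > image.mean()

def two_region_refine(image, mask, n_iter=10):
    """Stand-in for the refinement: reassign pixels to the closer region mean."""
    mask = np.asarray(mask, dtype=bool).copy()
    for _ in range(n_iter):
        if mask.all() or not mask.any():
            break
        c1, c2 = image[mask].mean(), image[~mask].mean()
        new_mask = (image - c1) ** 2 < (image - c2) ** 2
        if np.array_equal(new_mask, mask):
            break  # converged
        mask = new_mask
    return mask

# Example: a bright 4x4 blob inside an 8x8 region of interest.
roi = np.zeros((8, 8))
roi[2:6, 2:6] = 1.0
result = segment_mass(roi, box_filter, threshold_mask, two_region_refine)
```

Keeping the stages as interchangeable functions makes it straightforward to compare different noise reduction filters with the same downstream segmentation.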